R for Bio Data Science
November 28, 2023
Disease outbreaks occur when infectious diseases spread rapidly across the world; they can be caused by bacteria, viruses, fungi and parasites.
Infectious outbreaks pose severe health threats globally, such as:
COVID-19 in 2019-2022 (affected 562 million people worldwide)
SARS-CoV in 2003, severe acute respiratory syndrome
Influenza A (H1N1) in 2009-2010
Environmental factors can influence disease transmission. These include socioeconomic factors as well as regional and geographic factors.
The aim of the original article was to create a new dataset of infectious disease outbreaks from the Disease Outbreak News (DONs) and the Coronavirus Dashboard (WHO)
Our aim was to:
Final data set handling:
Outbreaks over time
Diseases with the most outbreaks: COVID-19, Influenza virus, Cholera
The frequency of zoonotic diseases has increased over the last decade
Outbreak frequency and income status
The 20 top diseases -> almost equally distributed in income groups
Excluding COVID-19 and influenza -> 66% in low income or lower middle income countries
Spatial distribution of outbreak frequency
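The exclusion step above amounts to filtering out selected diseases and then recomputing each income group's share of the remaining outbreaks. A minimal sketch follows, assuming one (disease, income group) pair per outbreak record. The rows, the disease labels and the helper `income_shares` are hypothetical and chosen only for illustration; they are not taken from the dataset and do not reproduce the 66% figure.

```python
from collections import Counter

# Hypothetical (disease, income group) pairs, one per outbreak record.
# These rows are made up for illustration and are not from the dataset.
records = [
    ("COVID-19", "High income"),
    ("Influenza", "High income"),
    ("Cholera", "Low income"),
    ("Cholera", "Lower middle income"),
    ("Measles", "Upper middle income"),
    ("Ebola", "Low income"),
]


def income_shares(rows, exclude=()):
    """Return each income group's share of outbreaks after dropping excluded diseases."""
    kept = [income for disease, income in rows if disease not in exclude]
    counts = Counter(kept)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing left after filtering
    return {group: n / total for group, n in counts.items()}


shares = income_shares(records, exclude={"COVID-19", "Influenza"})

# Combined share of the two lowest income groups.
low_share = shares.get("Low income", 0) + shares.get("Lower middle income", 0)
print(f"Low or lower middle income share: {low_share:.0%}")
```

Recomputing the denominator after filtering matters here: the share is taken over the remaining outbreaks, not over the full set.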
13 out of the 20 top countries -> Africa
3rd country with the most outbreaks -> USA
Findability, Accessibility, Interoperability, Reusability
The use of standardised naming (ISO-3166 and ICD-10) makes it possible to merge the data with data from other resources.
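As a rough illustration of why a shared country code helps, merging with a second source reduces to a key lookup on that code. The sketch below assumes each record carries an ISO 3166-1 alpha-3 code in a field named `iso3`. The rows, the field names and the helper `left_join_income` are hypothetical, not the dataset's actual columns. In the tidyverse the same idea would be a left join on the code column.

```python
# Hypothetical toy records; the rows are made up for illustration
# and are not taken from the dataset.
outbreaks = [
    {"iso3": "COD", "disease": "Cholera"},
    {"iso3": "USA", "disease": "Influenza"},
    {"iso3": "XXX", "disease": "Cholera"},  # deliberately unmatched code
]

# Hypothetical lookup table from a second source, keyed by the same code.
income_by_iso3 = {
    "COD": "Low income",
    "USA": "High income",
}


def left_join_income(records, lookup):
    """Attach an income group to each record; None marks an unmatched code."""
    return [{**row, "income_group": lookup.get(row["iso3"])} for row in records]


merged = left_join_income(outbreaks, income_by_iso3)

# Listing unmatched codes makes silent data loss visible after the merge.
unmatched = sorted({row["iso3"] for row in merged if row["income_group"] is None})

for row in merged:
    print(row)
print("Unmatched codes:", unmatched)
```

Keeping unmatched rows (a left join) rather than dropping them makes it easy to check which codes the second source does not cover.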
Access and tidy the data (using tidyverse principles)
Reproduce the plots provided in the article and add novel graphics
Added new data on income status, providing further knowledge on outbreaks
Did not include new DONs via web scraping (no permission obtained; packages and time were also limiting)
Income status is the current classification, not per year
Important for the development of targeting strategies
Group 12